Fix overly restrictive correctness checks in native line readers #16531
Merged
dain merged 3 commits into trinodb:master on Mar 14, 2023
For legacy reasons, line-oriented files may be replaced during query execution. The new file may have a different length, so do not check that the length is consistent.
electrum approved these changes on Mar 13, 2023
Do not check length when detecting a split file, because Hive allows files to be replaced during a query. Instead only the start position is checked. Allow a single header line for split files, because this always works due to the way file split handling works.
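The single-header-line case can be illustrated with a small simulation. This is a minimal sketch, not the Trino implementation: the class `HeaderSplitSketch` and its methods are hypothetical, and it assumes the common line-reader convention in which a split with a non-zero start discards its first (possibly partial) line, while every split reads each line that begins at or before its end offset. Under that assumption, the header is always consumed by the split that starts at offset zero, wherever the split boundary falls.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; names are not taken from the Trino code base.
final class HeaderSplitSketch
{
    private HeaderSplitSketch() {}

    // Returns the data lines produced by the split [start, start + length).
    static List<String> readSplit(byte[] file, int start, int length, boolean skipHeader)
    {
        int end = Math.min(start + length, file.length);
        int position = start;
        if (start > 0) {
            // Assumed convention: the first line of a non-initial split
            // belongs to the previous split, so discard it.
            position = nextLineStart(file, position);
        }

        List<String> lines = new ArrayList<>();
        boolean firstLine = true;
        // Read every line that begins at or before the end of the split.
        while (position < file.length && position <= end) {
            int next = nextLineStart(file, position);
            int lineEnd = (file[next - 1] == '\n') ? next - 1 : next;
            String line = new String(file, position, lineEnd - position, StandardCharsets.UTF_8);
            // Only the split starting at offset zero can see the header line.
            boolean isHeader = skipHeader && start == 0 && firstLine;
            if (!isHeader) {
                lines.add(line);
            }
            firstLine = false;
            position = next;
        }
        return lines;
    }

    // Returns the offset just past the next newline, or the file length if none remains.
    private static int nextLineStart(byte[] file, int position)
    {
        int i = position;
        while (i < file.length && file[i] != '\n') {
            i++;
        }
        return Math.min(i + 1, file.length);
    }
}
```

Reading the two splits of a file at any boundary and concatenating the results yields the data lines exactly once, with the header dropped. The same argument does not extend to two or more header lines, because the second header line may fall entirely inside a later split that has no way to recognize it.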
Description
Do not verify the length when determining whether a file is split; instead, only check that the start offset is zero. This is because Hive supports replacing files during a query, and users take advantage of this. The check is still correct, because a split file necessarily has a second split with a non-zero start offset.
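The difference between the two checks can be sketched as follows. This is a hypothetical illustration, assuming the old check compared the split length against the current file size; the class and method names are invented for the example and do not come from the Trino code base.

```java
// Hypothetical illustration only; names are not taken from the Trino code base.
final class SplitDetection
{
    private SplitDetection() {}

    // Assumed shape of the old check: the split must start at zero AND
    // cover exactly the current file size. If the file is replaced during
    // the query, the planned length no longer matches and the check fails.
    static boolean strictCheckPasses(long start, long plannedLength, long currentFileSize)
    {
        return start == 0 && plannedLength == currentFileSize;
    }

    // Relaxed check: only the start offset is examined. A file that was
    // really split still gets caught, because its second split has a
    // non-zero start offset.
    static boolean relaxedCheckPasses(long start)
    {
        return start == 0;
    }
}
```

For example, a file planned as a single split of 100 bytes and then replaced by a 120-byte file fails `strictCheckPasses(0, 100, 120)` but passes `relaxedCheckPasses(0)`. A file that really is split into two halves is still rejected when its second split is checked with `relaxedCheckPasses(50)`.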
For all native file formats, hardcode the handling of the isSplittable check as it is a behavior of the code and not the file format.
Fixes #16492
Fixes #16510
Release notes
( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text: